(54) Qualified priority queue scheduler

(57) A queue scheduling method and apparatus for a queuing
structure including a plurality of queues at different
priorities and having different bandwidth privileges
competing for the bandwidth of an output. The method selects
a queue at the current priority, having data for release to
the output and having available bandwidth to release data to
the output. If there are a plurality of such queues,
round-robin ordering determines which queue in the plurality
is selected. The apparatus includes a plurality of bit masks
representing the status in relation to a queue state variable
of the different queues, which masks are combined in a
bit-wise AND operation to determine which queues are
selectable for releasing data to the output.



FIGURE 1 




EP 1 130 877 A2



Description 

BACKGROUND OF THE INVENTION 

[0001] Data communication switches receive packets on a
plurality of inputs and transmit them on a plurality of
outputs. Sometimes, packets come in on two or more inputs
destined for the same output. To minimize collisions between
such packets, queuing is used. Queues temporarily store some
packets while others are consuming the bandwidth necessary
for delivery of the queued packets to an output.

[0002] The release of packets from queues compet- 
ing for output bandwidth is typically coordinated by a 
scheduler. Many schedulers honor a strict rule of priority 
scheduling, that is, queues are assigned different prior- 
ity levels and those associated with relatively high pri- 
ority levels are allowed to release packets before those 
associated with relatively low priority levels. Strict prior- 
ity scheduling allows flows which have been designated 
as relatively time-critical to take precedence over those 
which have been designated as relatively less time crit- 
ical. 

[0003] Despite its general efficacy, strict priority 
scheduling is not problem-free. One undesirable effect 
is priority blocking. Priority blocking occurs when delivery
of packets from relatively low priority queues suffers
a sustained delay due to servicing of higher priority
queues. Priority blocking can eventually cause packets 
in the blocked queues to be overwritten while awaiting 
release, a condition known as dropping. Dropping un- 
desirably disrupts the flows to which the dropped pack- 
ets relate. 

[0004] While priority scheduling is by design sup- 
posed to provide faster service to some queues than 
others, departure from a strict rule of prioritization may 
be warranted in certain cases. For instance, priority 
blocking may be caused by "stuffing" of a single queue 
at the priority level receiving service with non-critical or 
even redundant (e.g., in the case of an unresolved span- 
ning tree loop) traffic. Strict adherence to priority sched- 
uling would allow this "stuffing" to in effect hold all of the 
output bandwidth hostage at the expense of the relative- 
ly low priority queues. 

[0005] Accordingly, there is a need for a qualified pri- 
ority scheduling method for reducing priority blocking 
through the imposition of reasonable limitations on a 
general policy of prioritization. There is also a need for 
efficient scheduling hardware to effectuate the qualified 
priority scheduling. 

SUMMARY OF THE INVENTION 

[0006] The present invention provides a qualified pri- 
ority scheduling method and apparatus for a data queu- 
ing structure including a plurality of groups of queues 
competing for the bandwidth of an output. Each group 
has at least one queue and is associated with a different
priority level. Moreover, at least one queue within at
least one group is a restricted bandwidth queue, i.e. has
a bandwidth contract which cannot be exceeded.

[0007] In accordance with the queue scheduling method, a
queue within the group associated with the current priority
level which (i) has data for release to the output and
(ii) has available bandwidth, i.e. has not violated its
bandwidth contract (if any), is selected to release data to
the output. If there are a plurality of such queues,
round-robin ordering determines which queue in the plurality
is selected. The current priority level is initially set at
the highest priority level and is decremented until a queue
is selected or all priority levels have been checked,
whichever occurs first. Queues may have restricted or
unrestricted bandwidth. Unrestricted bandwidth queues always
have available bandwidth, whereas restricted queues have
available bandwidth only if they have credit. Credit is
allocated to restricted queues periodically and is reduced
in accordance with the length of released data. Data are
released in quanta of credit.

[0008] In accordance with the queue scheduling apparatus, a
selector is arranged for selecting a queue from the plurality
of queues as a function of a plurality of queue state
variables and transmitting queue selection information. The
selector preferably includes a plurality of bit masks
representing the status in relation to a queue state variable
of the different queues, which masks are combined in a
bit-wise AND operation to determine which queues are
selectable for releasing data to the output. The selector
preferably further includes an arbiter for receiving the
result of the determination, for selecting a single queue
round-robin from the selectable queues and for transmitting
the result of the round-robin selection to a driver for
releasing data from the selected queue.

[0009] These and other aspects of the present inven- 
tion may be better understood by reference to the fol- 
lowing detailed description taken in conjunction with the 
accompanying drawings briefly described below. 

BRIEF DESCRIPTION OF THE DRAWINGS 

[0010] 

Figure 1 is a policy level diagram of an exemplary
shared-output queuing structure with scheduling
characteristics;

Figure 2 is a block diagram of a preferred scheduling
apparatus for a shared-output queuing structure and an
associated queuing structure;

Figure 3 is a block diagram of the multi-stage selector of
Figure 2;

Figure 4 is a block diagram of the round-robin arbiter of
Figure 2;

Figure 5 is a block diagram of the statistics of Figure 2;

Figure 6 is a flow diagram of priority-bandwidth queue
selection steps;

Figure 7 is a flow diagram of qualifying queue selection
steps of round-robin arbitration; and

Figure 8 is a flow diagram of winning queue selection steps
of round-robin arbitration.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

[0011] The present invention is primarily directed to a
method and apparatus for scheduling the release of data from
a plurality of queues competing for the bandwidth of an
output. In Figure 1, an exemplary shared-output queuing
structure with scheduling characteristics is shown at a
policy level. In the exemplary structure, there are five
queues 10, 20, 30, 40, 50 competing for the bandwidth of
output 60. The release of data from queues 10, 20, 30, 40, 50
is scheduled, i.e. time-multiplexed, to avoid contention.
Queues 10, 20, 30, 40, 50 are characterized by priority
level, including high priority queue 10, medium priority
queues 20, 30 and low priority queues 40, 50 for releasing
data to output 60. Queues 10, 20, 30, 40, 50 are further
characterized by bandwidth type for releasing data to output
60. The sole high priority queue 10 is an unrestricted
bandwidth queue. Both medium priority queues 20, 30 are
restricted bandwidth queues. One low priority queue 40 is an
unrestricted bandwidth queue, whereas the other low priority
queue 50 is a restricted bandwidth queue. Queues 10, 20, 30,
40, 50 are further characterized by flow type, i.e. unicast
or flood. However, flow type has no independent bearing on
the order in which data are released to output 60. Of course,
the shared-output queuing structure illustrated in Figure 1
is merely one example of such structure; the number, priority
level and bandwidth type of queues will differ in other
shared-output queuing structures operative in accordance with
the present invention depending on the policies it is desired
to implement.

[0012] In a preferred scheduling method for a shared-output
queuing structure, a queue within the group associated with
the current priority level which has data for release to the
output and has available bandwidth, i.e. has not violated its
bandwidth contract, if any, is selected to release data to
the output. If there are a plurality of such queues,
round-robin ordering resolves which queue in the plurality is
selected. Various elaborations of this basic scheduling
method are possible, as described hereinafter. Nevertheless,
at a fundamental level, this basic method, despite its
apparent simplicity, is believed to confer a significant
advance over the scheduling methods of the prior art.

[0013] Applying the preferred scheduling method to the
exemplary shared-output queuing structure illustrated in
Figure 1, if unrestricted bandwidth high priority queue 10
has data for release to output 60, queue 10 is selected. If
queue 10 is not selected, restricted bandwidth medium
priority queues 20, 30 are checked. More particularly, if
queues 20, 30 each have data for release to output 60, and
each have available bandwidth, the next-in-line one of queues
20, 30 in round-robin order is selected. If queues 20, 30
each have data for release, but only one of queues 20, 30 has
available bandwidth, the one of queues 20, 30 with available
bandwidth is selected. If only one of queues 20, 30 has data
for release and available bandwidth, that one of queues 20,
30 is selected. If neither of queues 20, 30 is selected, low
priority queues 40, 50 are checked. More particularly, if
queues 40, 50 each have data for release to output 60, and
restricted bandwidth low priority queue 50 has available
bandwidth, the next-in-line one of queues 40, 50 in
round-robin order is selected. If queues 40, 50 each have
data for release, but restricted bandwidth low priority queue
50 does not have available bandwidth, unrestricted bandwidth
low priority queue 40 is selected. If only queue 40 has data
for release, queue 40 is selected. If only queue 50 has data
for release, queue 50 is selected if queue 50 has available
bandwidth. If neither of queues 40, 50 is selected, no data
are released to output 60.
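
By way of illustration only, the selection policy just applied to Figure 1 can be sketched in Python. The `Queue` record, the `select` function, the numeric priority encoding and the per-level round-robin pointer `last_winner` are assumptions of this sketch and are not elements of the disclosed apparatus.

```python
from dataclasses import dataclass


@dataclass
class Queue:
    qid: int                  # reference numeral from Figure 1 (10, 20, ...)
    priority: int             # larger value = higher priority (assumed encoding)
    restricted: bool          # True for a restricted bandwidth queue
    has_data: bool = False
    has_credit: bool = True   # only meaningful when restricted

    def eligible(self) -> bool:
        # Unrestricted queues always have available bandwidth.
        return self.has_data and (not self.restricted or self.has_credit)


def select(queues, last_winner):
    """Return the queue chosen to release data, or None.

    last_winner maps a priority level to the index (within that level's
    group) of the queue served most recently; it is updated in place.
    """
    for level in sorted({q.priority for q in queues}, reverse=True):
        group = [q for q in queues if q.priority == level]
        start = last_winner.get(level, -1) + 1
        for offset in range(len(group)):
            idx = (start + offset) % len(group)
            if group[idx].eligible():
                last_winner[level] = idx
                return group[idx]
    return None


# Queues of Figure 1: 10 high/unrestricted, 20 and 30 medium/restricted,
# 40 low/unrestricted, 50 low/restricted.
figure1 = [
    Queue(10, 2, False),
    Queue(20, 1, True, has_data=True),
    Queue(30, 1, True, has_data=True),
    Queue(40, 0, False, has_data=True),
    Queue(50, 0, True, has_data=True, has_credit=False),
]
rr = {}
first = select(figure1, rr)    # queue 20: queue 10 has no data
second = select(figure1, rr)   # queue 30: next in round-robin order
figure1[1].has_credit = figure1[2].has_credit = False
third = select(figure1, rr)    # queue 40: queue 50 lacks bandwidth
```

In this sketch the medium priority queues alternate while both hold credit, and service falls through to the low priority group only once neither medium priority queue is eligible.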

[0014] Referring now to Figure 2, a preferred shared-output
scheduling apparatus is shown at a component level along with
an associated queue structure. Queuing structure 210 includes
data queues 0 through N for receiving data on respective ones
of inputs 0 through N 212 and for releasing data to output
214. Release of data from queuing structure 210 is controlled
by scheduler 200. Scheduler 200 includes manager 220,
multi-stage selector 230, driver 240 and statistics 250.
Manager 220 retrieves data from statistics 250 and updates
selector 230 and statistics 250. Selector 230 includes
selection masks 232, priority-bandwidth selector 234 and
round-robin arbiter 236 for facilitating queue selection.
Selector 230 selects data queues for releasing data to output
214 and transmits queue selection information to driver 240.
Driver 240 receives queue selection information, retrieves
data from statistics 250, updates statistics 250, and
controls release of data from the selected queues.

[0015] Referring now to Figure 3, selector 230 is shown in
greater detail. Selector 230 has selection masks 232
including enablement mask 312, backlog mask 314, bandwidth
mask 316 and priority masks 318. Each mask maintains a status
bit for each of queues 0 through N in memory element 210.
Enablement mask 312 indicates which among queues 0 through N
are enabled for transmitting data to output 214. A "1" in the
bit position reserved for the queue represents enablement,
whereas a "0" represents non-enablement. Backlog mask 314
indicates which among queues 0 through N have data awaiting
transmission to output 214. A "1" in the bit position
reserved for the queue represents the presence of pending
data, whereas a "0" represents the absence of pending data.
Bandwidth mask 316 indicates which among queues 0 through N
have bandwidth for transmitting data to output 214. A "1"
represents the availability of bandwidth, whereas a "0"
represents the unavailability of bandwidth. Priority masks
318 collectively indicate the priority level of queues 0
through N. A separate priority mask is maintained for each
priority level. A "1" in the bit position reserved for the
queue in a priority mask indicates assignment of the queue to
the priority level which the mask represents, whereas a "0"
indicates non-assignment to the priority level. Masks 232 are
configurable by manager 220. Backlog mask 314 and bandwidth
mask 316 are updated by manager 220 in response to
information retrieved from statistics 250.

[0016] Selection masks 232 are coupled to priority-bandwidth
selector 234. Selector 234 facilitates queue selection by
reducing the selection to a round-robin selection among
participating queues. When participant selection is enabled,
enablement mask 312, backlog mask 314, bandwidth mask 316 and
one of priority masks 318 at a time are submitted to mask
compare 324 for a bit-wise AND operation. The one of priority
masks 318 submitted to the bit-wise AND operation is
determined by priority counter 328, which instructs
multiplexor 322 to release one of priority masks 318 based on
the current value of counter 328. For each participant
selection round, counter 328 selects the highest priority
mask first and selects priority masks of incrementally lower
priority until winner select enablement is indicated at the
current priority level, or all priority masks have been
submitted.

[0017] The result of the bit-wise AND operation performed by
mask compare 324 is a mask indicating which queues at the
current priority level, if any, are entitled to participate
in winner selection performed in round-robin arbiter 236.
Particularly, the resultant mask has a "1" at all bit
positions reserved for queues which are enabled for output
port 214, have pending data, have available bandwidth and are
at the current priority level, if any, and has a "0" at other
bit positions. Thus, the resultant mask indicates any
participating queues by a "1" and any non-participating
queues by a "0". The resultant mask is submitted to winner
select enable 326 for an OR operation. If at least one bit
position in the resultant mask has a "1", the OR operation
yields a "1" and winner selection is enabled at the current
priority level. Otherwise, the OR operation results in a "0",
winner selection is not enabled at the current priority level
and the bit-wise AND operation is performed at the next
priority level, if any. The OR operation result is supplied
as feedback to counter 328 to notify counter 328 whether an
additional priority mask selection is required. It will be
appreciated that if a "null" mask results from the bit-wise
AND operation at all priority levels, winner selection is not
enabled and the round-robin arbitration is not conducted.
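
The mask combination described above can be illustrated with ordinary integers standing in for the hardware masks. The following Python sketch is offered under stated assumptions and is not the disclosed circuit: the function names, the use of bit i for queue i, and the ordering of `priority_masks` from highest to lowest priority are choices made here for the example.

```python
def participating_mask(enable, backlog, bandwidth, priority_mask):
    """Bit-wise AND of the four masks; bit i corresponds to queue i."""
    return enable & backlog & bandwidth & priority_mask


def priority_bandwidth_select(enable, backlog, bandwidth, priority_masks):
    """Scan priority masks from highest to lowest priority.

    priority_masks is ordered highest priority first (an assumption of
    this sketch). Returns (level_index, mask) for the first non-null
    result, or (None, 0) when every level yields a null mask.
    """
    for level, pmask in enumerate(priority_masks):
        result = participating_mask(enable, backlog, bandwidth, pmask)
        if result != 0:   # equivalent to the OR over all result bits being 1
            return level, result
    return None, 0


# Five queues, bit 0 = queue 0 ... bit 4 = queue 4.
enable = 0b11111
backlog = 0b11110     # queue 0 has no pending data
bandwidth = 0b01011   # queues 2 and 4 are out of bandwidth
priority_masks = [0b00001, 0b00110, 0b11000]  # high, medium, low

level, mask = priority_bandwidth_select(enable, backlog, bandwidth,
                                        priority_masks)
```

Here the high priority level yields a null mask because its only queue is empty, so the scan moves to the medium level, where queue 1 alone participates.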

[0018] Priority-bandwidth selector 234 is coupled to
round-robin arbiter 236. Round-robin arbiter 236 resolves a
winning queue by making a round-robin selection among
participating queues as determined by selector 234. Turning
to Figure 4, round-robin arbiter 236 is shown in greater
detail. When winner selection is enabled, the participating
queue mask, i.e. the mask resulting from the bit-wise AND
operation performed in selector 234, is submitted to
preliminary select array 410 for qualifying queue
determinations. Particularly, each preliminary select element
in array 410 receives a subset of bits from the participating
queue mask and selects a single qualifying queue from among
the participating queues, if any, that it represents. More
particularly, the preliminary select element whose qualifying
queue was selected as the winning queue by final select 420
in the last round-robin arbitration selects as a qualifying
queue the next one of its participating queues in round-robin
order, if any. The other preliminary select elements select
as a qualifying queue the first one of their respective
participating queues, if any. The qualifying queue selections
are submitted to final select 420 along with any "no
qualifier" notices from any preliminary select elements not
representing any participating queues. Moreover, if the
preliminary select element whose qualifying queue was
selected as the last winning queue wrapped around, i.e.
returned from its last to first queue in the round-robin
order in the course of making the selection, the preliminary
select element submits a "wrap-around" notice to final select
420. Final select 420 selects as a winning queue the
qualifying queue from the preliminary select element whose
qualifying queue was selected as the last winning queue,
unless that preliminary select element submitted a "no
qualifier" or "wrap-around" notice. If that preliminary
select element submitted a "no qualifier" or "wrap-around"
notice, final select 420 selects as the winning queue the
qualifying queue from the next one of the preliminary select
elements in round-robin order, if any. Final select 420
transmits the winning queue identifier to queue driver 240 in
completion of the round-robin arbitration. The winning queue
identifier is also supplied as feedback to array 410 for use
by preliminary select elements in subsequent qualifying queue
selections.

[0019] Referring now to Figure 2 in conjunction with Figure
5, in response to the winning queue identifier received from
selector 230, driver 240 consults statistics 250 to retrieve
queue state information and controls the release of data from
the winning queue to output 214. Statistics 250 maintain
multi-field entries for each of queues 0 through N within
queuing structure 210 including a queue identifier, a head
pointer, a tail pointer, current depth, maximum depth,
current credit and total credit values. The winning queue
identifier is used to "look up" the corresponding entry in
statistics 250. A predetermined quantum of credit is added to
the current credit value for the corresponding entry. A
quantum may be advantageously selected in relation to the
maximum length of a packet formatted in accordance with a
particular protocol, such as 1518 bytes for an Ethernet
packet. Driver 240 controls release of data from the winning
queue to output 214 beginning with the data at the queue
location indicated by the head pointer in the corresponding
entry, until all data have been released from the winning
queue, the current credit has been exhausted or, in the case
of a restricted bandwidth queue, the total credit has been
exhausted, whichever occurs first. Driver 240 updates the
head pointer, current depth, total credit and current credit
values for the corresponding entry as data are released to
reflect the changing queue state. The current credit value is
"zeroed out" if all data have been released or, in the case
of a restricted bandwidth queue, the total credit is
exhausted before the current credit.
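
The credit accounting performed by driver 240 can be sketched as follows. This Python fragment is an illustration under stated assumptions, not the disclosed driver: the `Entry` record and `release` function are hypothetical names, a queue is modelled as a list of packet lengths, and a packet is assumed to be released only if it fits entirely within the remaining credit, a detail the text leaves open.

```python
from collections import deque
from dataclasses import dataclass, field

QUANTUM = 1518  # bytes; maximum Ethernet packet length, as in the text


@dataclass
class Entry:
    packets: deque = field(default_factory=deque)  # lengths; left end = head
    restricted: bool = False
    current_credit: int = 0
    total_credit: int = 0      # only consulted for restricted queues


def release(entry: Entry) -> list:
    """Release packets from a winning queue and update its entry."""
    entry.current_credit += QUANTUM
    released = []
    while entry.packets:
        length = entry.packets[0]
        if length > entry.current_credit:
            break                              # current credit exhausted
        if entry.restricted and length > entry.total_credit:
            entry.current_credit = 0           # total credit ran out first
            break
        entry.packets.popleft()
        entry.current_credit -= length
        if entry.restricted:
            entry.total_credit -= length
        released.append(length)
    if not entry.packets:
        entry.current_credit = 0               # all data released
    return released


restricted = Entry(deque([1000, 400, 600]), restricted=True, total_credit=1200)
sent_restricted = release(restricted)      # stops on total credit

unrestricted = Entry(deque([1000, 400, 600]))
sent_unrestricted = release(unrestricted)  # stops on current credit
```

In the restricted case the second packet fits the current credit but not the remaining total credit, so the current credit is zeroed; in the unrestricted case the unused current credit is carried over to the queue's next win.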

[0020] Manager 220 consults statistics 250 to ensure that
selection masks 232 reflect the queue state changes made by
driver 240. Current depth values in entries are consulted to
refresh backlog mask 314 and total credit values in entries
are consulted to refresh bandwidth mask 316. Manager 220 also
periodically refreshes total credit values in entries
maintained for restricted bandwidth queues.
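
The mask refresh performed by manager 220 might be sketched as below. The function name `refresh_masks` and the dictionary keys `depth`, `restricted` and `total_credit` are hypothetical; the sketch also assumes that bit i of each mask corresponds to queue i and that an unrestricted queue always has its bandwidth bit set.

```python
def refresh_masks(entries):
    """Rebuild the backlog and bandwidth masks from per-queue statistics.

    entries[i] is a dict for queue i with the keys 'depth', 'restricted'
    and 'total_credit' (hypothetical field names).
    """
    backlog = 0
    bandwidth = 0
    for i, e in enumerate(entries):
        if e["depth"] > 0:
            backlog |= 1 << i       # queue i has pending data
        if not e["restricted"] or e["total_credit"] > 0:
            bandwidth |= 1 << i     # queue i has available bandwidth
    return backlog, bandwidth


entries = [
    {"depth": 0, "restricted": False, "total_credit": 0},
    {"depth": 3, "restricted": True, "total_credit": 0},
    {"depth": 2, "restricted": True, "total_credit": 500},
]
backlog, bandwidth = refresh_masks(entries)
```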

[0021] Turning now to Figure 6, a flow diagram of
priority-bandwidth queue selection is shown. Participant
selection is enabled (610) and the highest priority is
selected as the current priority (620). Participating queues
at the current priority, if any, are determined based on
queue enablement, backlog and bandwidth states and priority
(630). If there is at least one participating queue at the
current priority (640), winner selection is enabled and the
round-robin arbiter is notified (650). If not, the next
highest priority, if any, is selected as the current priority
(660). If not all priorities have been checked, the flow
returns to Step 630; otherwise, the flow is exited (670).

[0022] Turning to Figure 7, a flow diagram of qualifying
queue selection in round-robin arbitration is shown for a
representative preliminary select element. The element
receives information for the queues the element represents
(710) and checks whether the last winning queue was the
element's last qualifying queue (720). If the last winning
queue is the element's last qualifying queue, the element
selects the next (participating or non-participating) queue
the element represents in round-robin order (730). If the
last winning queue is not the element's last qualifying
queue, the element selects the first (participating or
non-participating) queue the element represents as the
current queue (740). In either event, the element checks
whether the current queue has already been checked for
participation in this selection round (750). If the current
queue has already been checked for participation, the element
has no qualifying queue in this selection round and the final
selector is notified (755). If the current queue has not
already been checked for participation, it is checked (760).
If the current queue is a participating queue, the current
queue is the qualifying queue and the final selector is
notified (765). If the current queue is not a participating
queue, a further check is made to determine if the current
queue is the last queue the element represents (770). If the
current queue is the last represented queue, the element
wraps around, the final selector is notified and the flow
returns to Step 740. If the current queue is not the last
represented queue, the flow returns to Step 730.
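
The qualifying queue selection of Figure 7 can be condensed into a short loop. The following Python sketch is a simplification under stated assumptions rather than a step-for-step rendering of the flow diagram: the function name, its arguments and its return convention are hypothetical, and the "already checked" test of Step 750 is replaced by bounding the loop to one pass over the element's queues.

```python
def preliminary_select(bits, last_qualifier, won_last):
    """Pick one qualifying queue from an element's slice of the mask.

    bits           -- list of 0/1 participation flags for the queues
                      this element represents
    last_qualifier -- index this element offered in the previous round
    won_last       -- True if that queue was the last winning queue

    Returns (index or None, wrapped). None stands for a "no qualifier"
    notice; wrapped is True if the search passed from the element's
    last queue back to its first.
    """
    n = len(bits)
    start = (last_qualifier + 1) if won_last else 0
    wrapped = False
    for step in range(n):
        idx = start + step
        if idx >= n:
            idx -= n
            wrapped = True
        if bits[idx]:
            return idx, wrapped
    return None, wrapped


after_win = preliminary_select([1, 0, 1, 0], last_qualifier=0, won_last=True)
after_wrap = preliminary_select([1, 0, 1, 0], last_qualifier=2, won_last=True)
fresh = preliminary_select([1, 0, 1, 0], last_qualifier=2, won_last=False)
empty = preliminary_select([0, 0, 0, 0], last_qualifier=0, won_last=False)
```

An element that did not supply the last winner always starts from its first queue and therefore never reports a wrap-around in this sketch.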



[0023] Turning finally to Figure 8, a flow diagram of winning
queue selection in round-robin arbitration is shown. The
qualifying queues, "no qualifier" notices, if any, and
"wrap-around" notices, if any, are received (810) and the
preliminary select element whose qualifying queue won the
last arbitration is selected as the current element (820). A
check is made whether the current element has a qualifying
queue in this selection round (830). If the current element
has a qualifying queue in this selection round, a further
check is made whether the current element wrapped around
(840). If the current element did not wrap around, the
qualifying queue from the current element is the winning
queue, the queue driver and elements are notified and the
flow is exited (850). If, however, the current element does
not have a qualifying queue, or the current element wrapped
around, the next element in round-robin order is selected as
the current element (860). If the new current element has
already been checked in this selection round (870), the
previous element is reverted to as the current element and
the flow goes to Step 850. If the new current element has not
already been checked in this selection round, the flow
returns to Step 830.
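
The winning queue selection of Figure 8 can likewise be sketched in Python. The function name and argument layout are hypothetical, and one point is an interpretation made for this sketch only: when every element has been visited without finding a qualifier that did not wrap around, the sketch falls back to the wrapped-around qualifier of the element that supplied the last winner.

```python
def final_select(qualifiers, wrapped, last_element):
    """Choose the winning (element, queue) pair, or None.

    qualifiers   -- per-element qualifying queue index; None stands for
                    a "no qualifier" notice
    wrapped      -- per-element "wrap-around" flag
    last_element -- index of the element that supplied the last winner
    """
    n = len(qualifiers)
    for step in range(n):
        e = (last_element + step) % n
        if qualifiers[e] is not None and not wrapped[e]:
            return e, qualifiers[e]
    # Every element was visited: fall back to the starting element's
    # wrapped-around qualifier, if it has one (assumed reading).
    if qualifiers[last_element] is not None:
        return last_element, qualifiers[last_element]
    return None


same_element = final_select([2, 0, None], [False, False, False], 0)
next_element = final_select([0, 1, None], [True, False, False], 0)
only_wrapped = final_select([0, None, None], [True, False, False], 0)
```

Passing over a wrapped-around element in favour of the next element is what keeps the round-robin order fair across elements: a queue at the start of one element is not served again until the queues of the other elements have had their turn.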

[0024] It will be appreciated by those of ordinary skill in
the art that the invention can be embodied in other specific
forms without departing from the spirit or essential
character hereof. The present description is therefore
considered in all respects illustrative and not restrictive.
The scope of the invention is indicated by the appended
claims, and all changes that come within the meaning and
range of equivalents thereof are intended to be embraced
therein.



Claims 

1. A queue scheduling method for a queuing structure
having a plurality of queues for releasing data to an 
output, including a group of one or more queues at 
a first priority and a group of one or more queues at 
a second priority, wherein the first and second pri- 
orities are different, and a queue having restricted 
bandwidth, comprising: 

checking the group of one or more queues at 
the first priority for a queue having data for re- 
lease and available bandwidth; and 
if one or more queues having data for release 
and available bandwidth is found at the first pri- 
ority, releasing data to the output from one of 
the found queues. 

2. The queue scheduling method according to claim
1, further comprising the step of:

if no queue having data for release and available
bandwidth is found at the first priority, checking
the group of queues at the second priority for a
queue having data for release and available bandwidth.

3. The queue scheduling method according to claim
1, further comprising the step of:

if a plurality of queues having data for release 
and available bandwidth are found at the first prior- 
ity, selecting round-robin one of the found queues 
for releasing data to the output. 

4. The queue scheduling method according to claim 
1, further comprising the step of decreasing the 
available bandwidth for the queue from which data 
are released to the output if the queue from which 
data are released to the output has restricted band- 
width. 

5. The queue scheduling method according to claim
1, further comprising the step of increasing credit
for the queue from which data are released to the 
output. 

6. The queue scheduling method according to claim 
5, further comprising the step of decreasing credit 
for the queue from which data are released to the 
output in accordance with the length of the data
released.

7. The queue scheduling method according to claim
5, further comprising the step of periodically
increasing the available bandwidth of a queue having
restricted bandwidth. 

8. A queue selector comprising a plurality of masks, 
each mask having a plurality of bits, each bit repre- 
senting the status in relation to a queue state vari- 
able of a different queue within a plurality of queues 
coupled to an output, wherein the masks are combined
in a bit-wise AND operation to determine
which queues within the plurality, if any, are selecta- 
ble for releasing data to the output. 

9. The queue selector according to claim 8, wherein 
the queue state variables include backlog. 

10. The queue selector according to claim 8, wherein 
the queue state variables include bandwidth avail- 
ability. 

11. The queue selector according to claim 8, wherein
the queue state variables include priority.

12. The queue selector according to claim 8, further
comprising an arbiter for selecting a queue for
releasing data to the output from among the
selectable queues.

13. The queue selector according to claim 12, wherein
the selection is made round-robin. 



14. A scheduling apparatus for a queue structure having
a plurality of queues for receiving data on respective
ones of inputs and for releasing data to an output,
comprising:

a selector for selecting a queue from the plurality
of queues as a function of a plurality of queue state
variables and transmitting queue selection information; and

a driver for receiving the queue selection information
and controlling release of data from the selected queue.

15. The scheduling apparatus according to claim 14,
wherein the queue state variables include backlog.

16. The scheduling apparatus according to claim 14, 
wherein the queue state variables include band- 
width availability. 

17. The scheduling apparatus according to claim 14, 
wherein the queue state variables include priority. 

18. The scheduling apparatus according to claim 14,
further comprising:

a manager for updating the status of the plurality of
queues with respect to one or more of the queue state
variables.